Design and implementation of a massively parallel version of DIRECT

Authors

  • Jian He
  • Alex Verstak
  • Layne T. Watson
  • Masha Sosonkina
Abstract

This paper describes several massively parallel implementations of the global search algorithm DIRECT. Two parallel schemes take different approaches to the design challenges that DIRECT’s memory requirements and data dependencies impose. Three design aspects (topology, data structures, and task allocation) are compared in detail. The goal is to analytically investigate the strengths and weaknesses of these parallel schemes, identify several key sources of inefficiency, and experimentally evaluate a number of improvements in the latest parallel DIRECT implementation. The performance studies demonstrate improved data structure efficiency and load balancing on a 2200-processor cluster.

Similar resources

A Fully Distributed Parallel Global Search Algorithm

The n-dimensional direct search algorithm DIRECT of Jones, Perttunen, and Stuckman has attracted recent attention from the multidisciplinary design optimization community. Since DIRECT only requires function values (or ranking) and balances global exploration with local refinement better than n-dimensional bisection, it is well suited to the noisy function values typical of realistic simulation...

Performance Modeling and Analysis of a Massively Parallel Direct - Part 2

Modeling and analysis techniques are used to investigate the performance of a massively parallel version of DIRECT, a global search algorithm widely used in multidisciplinary design optimization applications. Several high-dimensional benchmark functions and real world problems are used to test the design effectiveness under various problem structures. In this second part of a two-part work, the...

Design and Implementation of a High Speed Systolic Serial Multiplier and Squarer for Long Unsigned Integer Using VHDL

A systolic serial multiplier for unsigned numbers is presented which operates without zero words inserted between successive data words, outputs the full product and has only one clock cycle latency. The multiplier is based on a modified serial/parallel scheme with two adjacent multiplier cells. Systolic concept is a well-known means of intensive computational task through replication of fu...

Refactoring the UrQMD model for many-core architectures

Ultrarelativistic Quantum Molecular Dynamics is a physics model to describe the transport, collision, scattering, and decay of nuclear particles. The UrQMD framework has been in use for nearly 20 years since its first development. In this period computing aspects, the design of code, and the efficiency of computation have been minor points of interest. Nowadays an additional issue arises due to ...

Journal:
  • Comp. Opt. and Appl.

Volume 40

Publication date: 2008